Automatic prosodic segmentation by F0 clustering using superpositional modeling

Authors

  • Mitsuru Nakai
  • Harald Singer
  • Yoshinori Sagisaka
  • Hiroshi Shimodaira
Abstract

In this paper, we propose an automatic method for detecting accent phrase boundaries in Japanese continuous speech using F0 information. In the training phase, hand-labeled accent patterns are parameterized according to the superpositional model proposed by Fujisaki and grouped into clusters by a clustering method; the centroid of each cluster serves as an accent template. In the segmentation phase, N-best boundary candidates are extracted automatically by One-Stage DP matching between the reference templates and the target F0 contour. About 90% of accent phrase boundaries were correctly detected in speaker-independent experiments on the ATR Japanese continuous speech database.
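
The segmentation phase described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it swaps One-Stage DP matching (frame-level time warping) for a simpler segmental dynamic program in which each candidate segment is compared with a linearly stretched template, and it returns only the single best segmentation rather than N-best candidates. The function names, the `segment_penalty` parameter, and the use of raw contour samples as templates (instead of Fujisaki model parameters) are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def resample(template: Sequence[float], length: int) -> List[float]:
    """Linearly stretch or compress a template to `length` frames."""
    if length == 1:
        return [float(template[0])]
    last = len(template) - 1
    out = []
    for i in range(length):
        pos = i * last / (length - 1)
        lo = min(int(pos), last)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append(template[lo] * (1.0 - frac) + template[hi] * frac)
    return out


def segment_cost(contour: Sequence[float], start: int, end: int,
                 template: Sequence[float]) -> float:
    """Squared error between contour[start:end] and the stretched template."""
    stretched = resample(template, end - start)
    return sum((contour[start + i] - stretched[i]) ** 2
               for i in range(end - start))


def segment_contour(contour: Sequence[float],
                    templates: Sequence[Sequence[float]],
                    min_len: int = 2,
                    segment_penalty: float = 0.1
                    ) -> Tuple[List[int], List[int], float]:
    """Return (interior boundaries, template index per segment, total cost).

    `segment_penalty` is a hypothetical per-segment cost that discourages
    splitting the contour into many tiny segments.
    """
    n = len(contour)
    inf = float("inf")
    best = [inf] * (n + 1)   # best[t]: minimum cost of explaining contour[:t]
    back = [None] * (n + 1)  # back[t]: (segment start, template index)
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(0, end - min_len + 1):
            if best[start] == inf:
                continue
            for k, template in enumerate(templates):
                cost = (best[start] + segment_penalty
                        + segment_cost(contour, start, end, template))
                if cost < best[end]:
                    best[end] = cost
                    back[end] = (start, k)
    if best[n] == inf:
        raise ValueError("contour is too short to segment with this min_len")
    # Trace back from the end of the contour to recover the segments.
    boundaries: List[int] = []
    labels: List[int] = []
    pos = n
    while pos > 0:
        start, k = back[pos]
        labels.append(k)
        if start > 0:
            boundaries.append(start)
        pos = start
    boundaries.reverse()
    labels.reverse()
    return boundaries, labels, best[n]


if __name__ == "__main__":
    # Toy templates: a rise-fall pattern and a plain fall (illustrative only).
    toy_templates = [[0, 1, 2, 1, 0], [2, 1.5, 1, 0.5, 0]]
    toy_contour = [0, 1, 2, 1, 0, 2, 1.5, 1, 0.5, 0]
    print(segment_contour(toy_contour, toy_templates))
```

On the toy contour, which is simply the two templates placed end to end, the sketch places one boundary at frame 5 and labels the two segments with templates 0 and 1. A real system would work on measured F0 contours with unvoiced gaps and would need the N-best extension mentioned in the abstract.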

Similar resources

CoPaSul Manual - Contour-based parametric and superpositional intonation stylization

The purposes of the CoPaSul toolkit are (1) automatic prosodic annotation and (2) prosodic feature extraction from syllable to utterance level. CoPaSul stands for contour-based, parametric, superpositional intonation stylization. In this framework intonation is represented as a superposition of global and local contours that are described parametrically in terms of polynomial coefficients. On t...

Detecting accent sandhi in Japanese using a superpositional F0 model

In this report, we propose a method for automatic prosodic structure recognition of Japanese utterances based on a superpositional F0 model, focusing particularly on the accent sandhi phenomenon in compound nouns. The method enables automatic labeling of F0 contours using the model, which can be useful for creating prosodic databases containing F0 contours in a parametric form. The prosodic str...

The use of F0 reliability function for prosodic command analysis on F0 contour generation model

This paper describes a method of utilizing an “F0 Reliability Field” (FRF), which we proposed in our previous work, for estimating prosodic commands on an F0 contour generation model. The FRF is a time-frequency representation of F0 likelihood, and its advantage is that it is not necessary to consider F0 errors that occur during automatic F0 determination. Therefore, it is thought...

A targets-based superpositional model of fundamental frequency contours applied to HMM-based speech synthesis

A superpositional model of fundamental frequency (F0) contours, as suggested by the Fujisaki model, can represent the F0 movements of speech well while keeping a clear relation with the linguistic information of utterances. Therefore, HMM-based speech synthesis is expected to improve by exploiting the merits of the superpositional model. In this paper, a targets-based superpositional model is proposed in the light of ...

Quantification of Segmentation and F0 Errors and Their Effect on Emotion Recognition

Prosodic features modelling pitch, energy, and duration play a major role in speech emotion recognition. Our word level features, especially duration and pitch features, rely on correct word segmentation and F0 extraction. For the FAU Aibo Emotion Corpus, the automatic segmentation of a forced alignment of the spoken word sequence and the automatically extracted F0 values have been manually cor...

Publication date: 1995